Text File | 1992-11-30 | 39.9 KB | 1,106 lines |
-
-
- > From: Mark Alexander Davis-Craig <mad@merit.edu>
- >
-
- > I was looking through the web and found information on servers and
- > clients. I saw mention in the "History" section about wanting to
- > develop a good protocol for information exchange, but haven't seen a
- > paper specifically about the www protocol. Is there one? If not,
- > could you describe it in some detail?
-
- You are right that the protocol documentation was not as good as it could
- have been. I have improved it. To save you browsing through the web for it,
- I append to this message the information as plain text.
-
- > I ask because we at the University of Michigan are evaluating www,
- > wais, and gopher for campus-wide information delivery.
- >
-
- I have no need to tell you what our suggestion would be! The W3 architecture
- will give you (almost) everything you can get from WAIS and Gopher rolled into one.
- The trick is that almost anything is representable by hypertext links and index searches. The
- Gopher menus and plain text, for example, are both special cases of hypertext. As it is more
- work to do the job for hypertext in general, we do not yet have software to cover as many
- platforms as Gopher, for example. However, when we do, the W3 system will be more flexible.
- Running a W3 server on top of a WAIS or Gopher world in fact makes these worlds subsets of the W3
- web. The reverse is not possible because the WAIS and Gopher information models are not flexible
- enough to encompass the W3 model.
-
- That said, if you want an indexer we can only recommend the wais code (or NeXT code), and we do
- not yet supply (as Gopher does) an off-the-shelf index server for either of those indexes. It
- is easy to do, however, with our generic server code.
-
- Please keep me informed of your thinking, whether you plan to go W3 or Gopher. If we can help
- you set up a demonstration system, then mail me.
-
-
- >Thanks in advance.
- > -----------------------------------------------------------------
- > Mark Davis-Craig, Merit/MichNet Technical Support Consultant
- > mad@merit.edu mad@merit.bitnet (313)-936-2110
-
-
- Tim Berners-Lee timbl@info.cern.ch
- World Wide Web project (NeXTMail is ok)
- CERN Tel: +41(22)767 3755
- 1211 Geneva 23, Switzerland Fax: +41(22)767 7155
-
- _________________________________ protocol notes follow ___________
-
-
- The HTTP Protocol As Implemented In W3
-
-
- HTTP AS IMPLEMENTED IN WWW
-
-
- This document defines the Hypertext Transfer protocol (HTTP) as currently
- implemented by the WorldWideWeb initiative software. This is a subset of the
- proposed full HTTP protocol. No client profile information is transferred
- with the query. Future HTTP protocols will be back-compatible with this
- protocol.
-
-
- The protocol uses the normal internet-style telnet protocol style on a
- TCP-IP link. The following describes how a client acquires a (hypertext)
- document from an HTTP server, given an HTTP document address.
-
-
- Connection
-
- The client makes a TCP-IP connection to the host using the domain name or IP
- number, and the port number given in the address.
-
-
- During development, the default HTTP TCP port number is 2784 -- this will
- change when an official port number is allocated.
-
-
- The server accepts the connection.
-
-
- Note: HTTP currently runs over TCP, but could run over any
- connection-oriented service. The interpretation of the protocol below in
- the case of a sequenced packet service (such as DECnet(TM) or ISO TP4) is
- that the request should be one TPDU, but the response may be many.
-
-
- Request
-
- The client sends a document request consisting of a line of ASCII characters
- terminated by a CR LF (carriage return, line feed) pair. A well-behaved
- server will not require the carriage return character.
-
-
- This request consists of the word "GET", a space, and the document address,
- omitting the "http:", host and port parts when they are the coordinates just
- used to make the connection. (If a gateway is being used, then a full
- document address may be given specifying a different naming scheme).
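-
- As an illustration only, the request line described above might be built as
- follows. This is a minimal sketch in Python, not part of the W3 software:
- the function name build_request and its parameters are hypothetical, and
- sending "/" when the address has no path is an assumption.
-

```python
DEFAULT_PORT = 2784  # development default quoted in this document; may change


def build_request(address, connected_host, connected_port=DEFAULT_PORT):
    """Return the one-line GET request for a document address.

    The "http:", host and port parts are dropped when they are the
    coordinates of the connection just made. Otherwise (the gateway
    case) the full document address is sent unchanged.
    """
    prefix = "http://"
    if address.startswith(prefix):
        hostport, slash, rest = address[len(prefix):].partition("/")
        host, _, port_text = hostport.partition(":")
        port = int(port_text) if port_text else DEFAULT_PORT
        if (host.lower(), port) == (connected_host.lower(), connected_port):
            address = (slash + rest) or "/"   # assumption: empty path sent as "/"
    return "GET " + address + "\r\n"
```

-
- For the address http://info.cern.ch/hypertext/WWW/TheProject.html and a
- connection to info.cern.ch, this yields the line
- "GET /hypertext/WWW/TheProject.html" followed by CR LF.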
-
-
- The search functionality of the protocol lies in the ability of the
- addressing syntax to describe a search on a named index.
-
-
- A search should only be requested by a client when the index document
- itself has been described as an index using the ISINDEX tag.
-
-
- Response
-
- The response to a simple GET request is a message in hypertext mark-up
- language (HTML). This is a byte stream of ASCII characters.
-
-
- Lines shall be delimited by an optional carriage return followed by a
- mandatory line feed character. The client should not assume that the
- carriage return will be present. Lines may be of any length. Well-behaved
- servers should restrict line length to 80 characters excluding the CR LF
- pair.
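-
- A client-side reading of the line rule above could look like this. It is a
- sketch only: split_lines is a hypothetical helper name, and the whole
- response is assumed to be in memory already.
-

```python
def split_lines(data):
    """Split a response byte stream into lines.

    Each line ends with a mandatory LF. A CR immediately before the LF is
    optional and is dropped when present.
    """
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()                     # the stream ended with a line feed
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]
```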
-
-
- The format of the message is HTML - that is, a trimmed SGML document. Note
- that this format allows for menus and hit lists to be returned as hypertext.
- It also allows for plain ASCII text to be returned following the PLAINTEXT
- tag.
-
-
- The message is terminated by the closing of the connection by the server.
-
-
- Well-behaved clients will read the entire document as fast as possible. The
- client shall not wait for user action (output paging for example) before
- reading the whole of the document. The server may impose a timeout of the
- order of 15 seconds on inactivity.
-
-
- Error responses are supplied in human readable text in HTML syntax. There
- is no way to distinguish an error response from a satisfactory response
- except for the content of the text.
-
-
- Disconnection
-
- The TCP-IP connection is broken by the server when the whole document has
- been transferred.
-
-
- The client may abort the transfer by breaking the connection before this,
- in which case the server will not record any error condition.
-
-
- Requests are idempotent. The server need not store any information about
- the request after disconnection.
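-
- Putting connection, request, response and disconnection together, a client
- might be sketched as below. This is an illustration under stated
- assumptions, not the W3 client code: the function name fetch, the 4096-byte
- read size and the 30 second client-side timeout are all arbitrary choices.
-

```python
import socket


def fetch(host, path, port=2784, timeout=30.0):
    """Fetch one document: connect, send one GET line, read until close."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(("GET " + path + "\r\n").encode("ascii"))
        chunks = []
        while True:
            chunk = conn.recv(4096)     # read as fast as possible, no paging
            if not chunk:               # server closed the link: end of document
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

-
- Note that the sketch cannot tell an error response from a satisfactory one:
- as stated above, only the content of the returned text distinguishes them.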
-
-
- _________________________________________________________________
-
-
- Tim BL
-
-
- W3 NAMING SCHEMES
-
-
- (See also: a discussion of design issues involved, BNF syntax, W3
- background)
-
-
- The format of a hypertext name consists of the name of the naming
- sub-scheme to be used, then a name in a format particular to that subscheme,
- then an optional anchor identifier within the document. For example, the
- format for all internet-based access methods is:
-
-
- scheme : // host.domain:port / path / path # anchor
-
-
- A suffix # anchor id allows one to refer to a particular anchor within a
- document.
-
-
- A suffix ? followed by words separated by + signs allows one to search an
- index (see details).
-
-
- References from one document to another with a similar name may be
- abbreviated to a relative name. This imposes certain restrictions on the
- way that the "path" is represented.
-
-
- A special format is used to represent a search on an index. See also: the
- full BNF description, about escaping illegal characters.
-
-
- Examples
-
-
- file://cernvax.cern.ch/usr/lib/WWW/defaut.html#123
-
- This is a fully qualified file name, referring to a document in the file
- name space of the given internet node, and an imaginary anchor 123 within
- it.
-
-
-
- #greg
-
- This refers to anchor "greg" in the same document as that in which the name
- appears.
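-
- To make the parts of a name concrete, here is a small sketch that splits a
- name of the internet-based form into its components. The helper name
- split_name is hypothetical, and the sketch deliberately ignores schemes
- (such as news) which do not use the "//" form.
-

```python
def split_name(name):
    """Split a W3 name into (scheme, host, port, path, anchor).

    Parts that are absent come back as None (the path as "").
    """
    name, hash_mark, anchor = name.partition("#")
    anchor = anchor if hash_mark else None
    scheme = host = port = None
    if "://" in name:
        scheme, _, rest = name.partition("://")
        hostport, slash, tail = rest.partition("/")
        host, colon, port_text = hostport.partition(":")
        port = int(port_text) if colon else None
        name = slash + tail
    return scheme, host, port, name, anchor
```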
-
-
- Naming sub-schemes
-
- Different schemes usually use different protocols on the network. The format
- of the address after the scheme name is a function of the particular scheme.
- In practice, all internet-based schemes have a common format for the node
- name and port. Schemes currently defined are as follows, with links to
- more details.
-
-
- file Access is provided to files, using whatever means the
- browser and/or gateways have to reach files on obscure
- machines.
-
-
- news Access is provided to news articles, and newsgroups,
- normally using the NNTP protocol.
-
-
- http Access is provided to any other information using the
- HTTP search and retrieve protocol. The internal
- addressing of the information system is mapped onto a
- W3 path.
-
-
- telnet Access is provided by an interactive telnet session.
- This is provided ONLY as an interface to other
- existing online systems which cannot or have not been
- mapped onto the W3 space.
-
-
- gopher Access is provided using the "gopher" protocol. The
- gopher protocol is similar to HTTP but uses separate
- concepts of menus and text files rather than
- hypertext.
-
-
- Other schemes we foresee are wais and x500. Systems (such as WAIS) which
- are not currently accessed directly by W3 servers may be accessed through
- gateways, in which case the document address is encoded within the http
- address of the document in the gateway. Browsers which do not have the
- ability to use certain protocols may (in principle) be configured to
- automatically use certain gateways for certain addressing schemes.
-
-
- This will allow, for example, simple PC-based clients to follow links
- through X500 name servers.
-
-
- RELATIVE NAMING
-
-
- The address of a hypertext document is normally given within the context of
- another hypertext document. Where the addresses of the two documents are
- similar, this allows only the difference between the two names to be given,
- saving space. An example is the address of the destination of a hypertext
- link, which is specified relative to the source document address.
-
-
- (A further practical advantage is that a group of documents may be
- transmitted without internal changes, or accessed using more than one
- address.)
-
-
- In the WWW address format, the rules for relative naming are:
-
-
- If the "scheme" parts are different, the whole absolute address must be
- given. Otherwise, the scheme is omitted, and:
-
-
- If the "host" and/or "port" parts are different, the host name and
- all the rest of the address must be given. The host name may be given
- using internet hostname conventions, i.e. domains may be omitted where
- different. This is not very well defined: one tends to assume that
- if any dot is present, then the full domain name is being given, up
- to the root (.) domain, while if there are no dots, the domain is the
- same as that of the hostname part of the base address.
-
-
- If the access and host parts are the same, then the path may be given
- with the unix convention, including the use of ".." to indicate
- deletion of a path element. Within the path:
-
-
- If a leading slash is present, the path is absolute. Otherwise:
-
-
- The last part of the path of the base address (e.g. the filename of the
- current document) is removed, and the given relative address appended
- in its place.
-
-
- Within the result, all occurrences of "xxx/.." are recursively removed,
- where xxx is one path element (directory).
-
-
- The use of the slash "/" and double dot ".." in this case must be respected
- by all servers. If necessary, this may mean converting their local
- representations in order that these characters should not appear within path
- elements (see "escaping").
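-
- The rules above can be sketched in code as follows. This is one possible
- reading, not the browser's actual algorithm: the name resolve is
- hypothetical, a name beginning with "//" is assumed to be how a different
- host is written once the scheme is omitted, abbreviated host names are not
- expanded, and the recursive removal of "xxx/.." is done in a single pass
- with a stack, which gives the same result.
-

```python
def resolve(base, relative):
    """Resolve a relative name against the address of the current document."""
    if "://" in relative:               # scheme given: already absolute
        return relative
    scheme, _, rest = base.partition("://")
    hostport, _, path = rest.partition("/")
    path = "/" + path.partition("#")[0]         # base path without its anchor
    if relative.startswith("#"):        # anchor in the same document
        return scheme + "://" + hostport + path + relative
    if relative.startswith("//"):       # assumption: host and the rest given
        return scheme + ":" + relative
    if relative.startswith("/"):        # leading slash: the path is absolute
        path = relative
    else:                               # replace the last part of the base path
        path = path[: path.rfind("/") + 1] + relative
    elements = path.split("/")
    kept = []
    for i, element in enumerate(elements):
        if element == ".." and kept and kept[-1] not in ("", ".."):
            kept.pop()                  # "xxx/.." cancels out
            if i == len(elements) - 1:
                kept.append("")         # keep the trailing slash
        else:
            kept.append(element)
    return scheme + "://" + hostport + "/".join(kept)
```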
-
-
- ADDRESS FOR AN INDEX SEARCH
-
-
- If a given hypertext node is an index, or the server has an index associated
- with it, then a search may be done on that index by suffixing the name of
- the index with a list of keywords, after a question mark:
-
-
-
- address_of_index ? keywordlist
-
- The address of the index is a normal hypertext address. In the keywordlist,
- multiple keywords are separated by plus signs (+). (See BNF syntax
- description.) The resulting string still does not contain any spaces. It
- may be considered to be the hypertext address of a document which is the
- result of making the keyword search on the index. Normally, if the search
- was successful, the document returned will contain anchors leading to other
- documents which match the selection criteria.
-
-
- The search method, and the logical and lexical functions, weights, etc
- applied to the keywords will depend on the index address. One actual index
- may have several hypertext addresses, which when searched on will behave in
- different ways. For example, one may allow a search on author-given keywords
- only, while another may be a full text search. These things particular to
- an index should be described in the hypertext page for the index node itself
- (or in linked documents). For example, a server may allow specific boolean
- search combinations to be represented by the words "and", "or" and "not".
-
-
- Example:
-
-
- http://cernvm/FIND/?sgml+cms
-
- indicates the result of performing a search for keywords "sgml" and "cms" on
- the index http://cernvm/FIND/.
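-
- Forming such an address is a matter of string joining. The sketch below
- uses a hypothetical helper name, search_address, and simply refuses
- keywords which would need escaping rather than escaping them.
-

```python
def search_address(index_address, keywords):
    """Return the address of the result of searching an index for keywords."""
    for word in keywords:
        if not word or " " in word or "+" in word or "?" in word:
            raise ValueError("keyword cannot be used as-is: %r" % word)
    return index_address + "?" + "+".join(keywords)
```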
-
-
- HTTP ADDRESSING
-
-
- With an access code of http:, a protocol introduced for the WWW initiative
- is used to acquire data from a server. This is the "Hypertext Transfer
- protocol", HTTP, a simple search and retrieve (S and R) protocol.
-
-
- The syntax of an http address is, with [] indicating optional parts (see
- BNF description),
-
-
-
- http : // hostname [ : port ] / path [ ? searchwords ]
-
- for example, the following are valid addresses:
-
-
-
- http://info.cern.ch/hypertext/WWW/TheProject.html
-
- http://crnvmc.cern.ch/FIND?sgml+examples
-
- HTTP addresses conform to the WWW conventions, including the possibility of
- using the search format. The significance of the items in the path part of
- the document name is completely up to the server. Different paths may be
- used to select different databases, different views of the same database,
- etc.
-
-
- hostname This is the name of the server in internet form. A
- numeric form (e.g. 128.141.201.74) may be used, but the
- domain name form (e.g. info.cern.ch) is preferred. The
- hostname is mandatory.
-
-
- port This is a numeric port number. If a non-numeric
- string is used, it must be a defined service name.
- Note that as there is no central repository for
- service names (they are defined locally for each
- host), a service name is NOT an appropriate way to
- specify a port number for a hypertext address. If the
- port number is omitted the preceding colon must also
- be omitted. In this case, port number 2784 is assumed
- [This may change!].
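-
- A sketch of taking an http address apart according to the syntax above
- follows. The name parse_http_address is hypothetical, and rejecting
- service names outright (rather than looking them up locally) is a choice
- made for this illustration.
-

```python
def parse_http_address(address):
    """Split an http address into (hostname, port, path, searchwords)."""
    prefix = "http://"
    if not address.startswith(prefix):
        raise ValueError("not an http address: " + address)
    hostport, _, rest = address[len(prefix):].partition("/")
    hostname, colon, port_text = hostport.partition(":")
    if not hostname:
        raise ValueError("the hostname is mandatory")
    if colon and not port_text.isdigit():
        # Also catches a colon with no port number after it.
        raise ValueError("a numeric port must follow the colon")
    port = int(port_text) if colon else 2784    # default may change
    path, _, search = rest.partition("?")
    words = search.split("+") if search else []
    return hostname, port, "/" + path, words
```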
-
-
- See also: WWW addressing in general, HTTP protocol.
-
-
- _________________________________________________________________
-
-
- Tim BL
-
-
- W3 ADDRESSES OF FILES
-
-
- The format of a hypertext reference to a file is an extension of the unix
- naming system. The full explicit format is:
-
-
- file : // node / directories / name
-
-
- The actual protocols used by the client depend on the implementation of the
- browser and the environment. Typically, the browser will check to see
- whether the node is the local node, or a node for which files are available
- mounted in some form of distributed file system. If neither of these is
- the case, then the browser may try rpc, anonymous FTP or other protocols.
-
-
- Examples
-
-
- file://cernvax.cern.ch/usr/lib/WWW/defaut.html
-
- This is a fully qualified file name.
-
-
-
- fred.html
-
- This relative name, used within a file, will refer to a file on the same
- node and in the same directory as that file, but with the name fred.html.
-
-
- Improvements : Directory access
-
- The final file name should be optional. If the address ends with a '/', the
- browser should retrieve the contents of the specified directory and generate
- a page of virtual hypertext pointing to its contents. In addition, it could
- display an information file contained in that directory, if any is present.
- Suggested file names to search for, in order: README.html, *README*.html,
- README, *README*, *readme*.
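-
- The suggested search order could be applied to a directory listing as in
- the sketch below. The names SUGGESTED and pick_info_file are hypothetical,
- and taking the alphabetically first name when several match one pattern is
- an assumption; the text above does not say how to break such ties.
-

```python
import fnmatch

# Patterns in the order suggested above.
SUGGESTED = ["README.html", "*README*.html", "README", "*README*", "*readme*"]


def pick_info_file(names):
    """Return the first directory entry matching the patterns, in order."""
    for pattern in SUGGESTED:
        matches = sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))
        if matches:
            return matches[0]
    return None                         # no information file present
```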
-
-
-
-
-
-
- HYPERTEXT ADDRESS FOR NET NEWS
-
-
- The format of a hypertext reference to information in the internet/usenet
- news system can take any of the following forms:
-
-
- news: newsgroup This refers to a list of articles currently available
- in the given newsgroup. The newsgroup is a series of
- alphanumeric characters and dots.
-
-
- news:* This refers to a list of valid newsgroups.
-
-
- news: message_id This refers to a given article explicitly. The
- message_id is optionally surrounded by angle brackets,
- and must contain an @ sign.
-
-
-
-
-
-
- Possible extensions to this are more generous wildcarding for the list of
- newsgroups. It takes too long to load the whole list, and it would be more
- useful to be able to browse through a set of newsgroups.
-
-
- There is no way of referring to "unread" articles. Keeping track of this is
- the job of the browser.
-
-
- Examples
-
-
- news:<12345678@cernvax.cern.ch>
-
- news:12345678@cernvax.cern.ch
-
- These addresses both refer to the same (imaginary!) article by its unique
- message-id.
-
-
-
-
- news:comp.sys.next.announce
-
- This refers to a list of articles in the newsgroup comp.sys.next.announce.
- The list is, of course, a list of references to articles by message-id.
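-
- The three forms of news address can be told apart with very little code.
- In this sketch the function name classify_news_address and the returned
- labels are hypothetical.
-

```python
def classify_news_address(address):
    """Return ("groups", None), ("article", id) or ("newsgroup", name)."""
    prefix = "news:"
    if not address.startswith(prefix):
        raise ValueError("not a news address: " + address)
    rest = address[len(prefix):]
    if rest == "*":
        return ("groups", None)         # the list of valid newsgroups
    if "@" in rest:                     # a message_id must contain an @ sign
        if rest.startswith("<") and rest.endswith(">"):
            rest = rest[1:-1]           # the angle brackets are optional
        return ("article", rest)
    return ("newsgroup", rest)
```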
-
-
- TELNET ADDRESSING
-
-
- A telnet address is a special case of a W3 address.
-
-
- When a telnet address is used, information can only be retrieved using an
- interactive telnet session. This has the disadvantage that information
- cannot be indexed, searched, etc automatically, nor can it be gatewayed into
- other systems. The telnet addressing form is used to allow a pointer to
- information systems such as library information systems which have not been
- gatewayed into the web properly yet.
-
-
- The syntax is, with [] indicating optional parts (see full BNF)
-
-
-
- telnet : / / [ user @ ] host [ : port ]
-
- There should be no spaces. For example, the following are valid telnet
- addresses:
-
-
-
- telnet://www@info.cern.ch:23
-
- telnet://www@info.cern.ch
-
- telnet://info.cern.ch
-
- user is the optional name of the user to be used for login.
- If the username is omitted, then so must be the "@"
- sign. This is equivalent to the argument used with the
- -l option on the ucb telnet command. When the username
- is omitted, some access servers will prompt for a
- username and password.
-
-
- host This is the name of the server in internet form. A
- numeric form (e.g. 128.141.201.74) may be used, but the
- domain name form (e.g. info.cern.ch) is preferred.
- The host is mandatory.
-
-
- port This is a numeric port number. If a non-numeric string
- is used, it must be a defined service name. Note that
- as there is no central repository for service names
- (they are defined locally for each host), a service
- name is NOT an appropriate way to specify a port
- number for a hypertext address. If the port number is
- omitted the preceding colon must also be omitted. In
- this case, port number 23 is assumed.
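-
- For illustration, a telnet address might be taken apart as below. The
- helper name parse_telnet_address is hypothetical, and as a choice of this
- sketch only numeric ports are accepted.
-

```python
def parse_telnet_address(address):
    """Split a telnet address into (user, host, port)."""
    prefix = "telnet://"
    if not address.startswith(prefix):
        raise ValueError("not a telnet address: " + address)
    rest = address[len(prefix):]
    user = None
    if "@" in rest:
        user, _, rest = rest.partition("@")
        if not user:
            raise ValueError("an omitted user must also omit the @ sign")
    host, colon, port_text = rest.partition(":")
    if not host:
        raise ValueError("the host is mandatory")
    if colon and not port_text.isdigit():
        raise ValueError("a numeric port must follow the colon")
    return user, host, int(port_text) if colon else 23
```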
-
-
- _________________________________________________________________
-
-
- Tim BL
-
-
- GOPHER ADDRESSING
-
-
- Gopher addresses indicate that the gopher protocol should be used to access
- the information. The Gopher protocol is a simple internet protocol similar
- to HTTP. It allows the transfer of menus or plain text files. (HTTP
- expresses both menus and plain text files as special cases of hypertext
- files). See the gopher protocol notes.
-
-
- The syntax is, with [] indicating optional parts (see BNF)
-
-
-
- gopher:// hostname [: port ] [/gtype/ [selector] ] [ ? search ]
-
- There should be no spaces. For example, the following are valid addresses:
-
-
-
- gopher://gopher.micro.umn.edu:70
-
- gopher://gopher.micro.umn.edu:70/1/
-
- gopher://gopher.micro.umn.edu:70
-
- The W3 address for a gopher item may be derived from the fields of a gopher
- menu line which has the format
-
-
- host This is the name of the server in internet form. A
- numeric form (e.g. 128.141.201.74) may be used, but the
- domain name form (e.g. info.cern.ch) is preferred. The
- hostname is mandatory.
-
-
- port This is a numeric port number. If a non-numeric
- string is used, it must be a defined service name.
- Note that as there is no central repository for
- service names (they are defined locally for each
- host), a service name is NOT an appropriate way to
- specify a port number for a hypertext address. If the
- port number is omitted the preceding colon must also
- be omitted. In this case, port number 70 is assumed.
-
-
- gtype This is a gopher item type number, a (hopefully
- printable!) ASCII character. Currently these types
- are all ASCII decimal digit characters. Character "0"
- (hex 30) signifies a plain text file. Character "1"
- signifies a Menu. Character "7" signifies a
- searchable index. Character "8" should not be used in
- a W3 address: use telnet addressing instead. In
- general W3 terms, the type is the first part of the
- path. The rest of the path is the gopher selector
- string. The type field is a hint to the client as to
- how to represent the anchor, and how to follow it.
-
-
- selector This is the string to be sent to the gopher server to
- identify the information required.
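-
- As a rough sketch, the derivation of a W3 address from a gopher menu line
- might look like this. The tab-separated layout assumed here (type
- character and title, then selector, host and port) is the one commonly
- documented for Gopher menus and is an assumption as far as this text goes.
- The function name is hypothetical, the port is always written out, and
- selectors containing illegal characters would still need escaping.
-

```python
def gopher_menu_line_to_address(line):
    """Return (title, W3 address) for one line of a gopher menu."""
    first, selector, host, port = line.rstrip("\r\n").split("\t")[:4]
    gtype, title = first[:1], first[1:]
    if gtype == "8":
        # Type 8 should not appear in a W3 address: use telnet addressing.
        return title, "telnet://%s:%s" % (host, port)
    return title, "gopher://%s:%s/%s/%s" % (host, port, gtype, selector)
```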
-
-
- _________________________________________________________________
-
-
- Tim BL
-
-
- ESCAPING ILLEGAL CHARACTERS
-
-
- The W3 address syntax allows a path to contain most printable ASCII
- characters, but some, inevitably used for punctuation, are excluded. W3
- addresses are sometimes used to represent addresses in some other space.
- This happens when an HTTP server, for example, uses file names as its
- document names, or when addresses from some other protocol (Gopher, WAIS,
- etc) are mapped into the W3 web.
-
-
- In these cases, a convention is normally used to map illegal characters in
- these "foreign" names onto the allowed set.
-
-
- In the case of an HTTP server, any mapping may be used.
-
-
- A suitable convention is that a percent sign (%) followed by two
- hexadecimal digits (0-9 or a-f) stands for the single character with ASCII
- hexadecimal code represented by those two digits (Most significant digit
- first).
-
-
- A percent sign itself must therefore be represented by %25, as 25 hex is
- the ASCII code for "%".
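-
- The percent convention can be sketched as a pair of functions. The names
- escape and unescape and the deliberately small SAFE set are assumptions of
- this illustration: since any mapping may be used by a server, the set of
- characters left unescaped here is only one conservative choice.
-

```python
import string

# Characters left as they are (an assumption; a server may choose otherwise).
SAFE = set(string.ascii_letters + string.digits + "_.")


def escape(element):
    """Escape one path element: anything outside SAFE becomes %xx."""
    out = []
    for byte in element.encode("ascii"):
        ch = chr(byte)
        out.append(ch if ch in SAFE else "%%%02x" % byte)
    return "".join(out)


def unescape(text):
    """Replace each %xx by the character with that hexadecimal ASCII code."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "%":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
                raise ValueError("bad escape at position %d" % i)
            out.append(chr(int(pair, 16)))
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

-
- Because the slash is not in the safe set, a foreign name containing "/"
- can be carried inside a single path element without being mistaken for a
- path separator.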
-
-
- _________________________________________________________________
-
-
- Tim BL
-
-
- W3 ADDRESS SYNTAX: BNF
-
-
- This is a BNF-like description of the W3 addressing syntax. We use a
- vertical line "|" to indicate alternatives, and [brackets] to indicate
- optional parts. Spaces are representational only: no spaces are actually
- allowed within a W3 address. Single letters stand for single letters. All
- words of more than one letter below are entities described elsewhere in the
- syntax description. (Entity names are here linked to their definitions,
- probably making this unreadable with the line mode browser.)
-
-
- An absolute address specified in a link is an anchoraddress. The address
- which is passed to a server is a docaddress.
-
-
- anchoraddress docaddress [ # anchor ]
-
-
- docaddress httpaddress | fileaddress | newsaddress |
- telnetaddress | gopheraddress
-
-
- httpaddress h t t p : / / hostport [ / path ] [ ? search ]
-
-
- fileaddress f i l e : / / host / path
-
-
- newsaddress n e w s : groupart
-
-
- groupart * | group | article
-
-
- group ialpha [ . group ]
-
-
- article xalphas @ host
-
-
- telnetaddress t e l n e t : / / [ user @ ] hostport
-
-
- gopheraddress g o p h e r : / / hostport [/ gtype [ / selector ]
- ] [ ? search ]
-
-
- hostport host [ : port ]
-
-
- host hostname | hostnumber
-
-
- hostname ialpha [ . hostname ]
-
-
- hostnumber digits . digits . digits . digits
-
-
- port digits
-
-
- selector path
-
-
- path void | xalphas [ / path ]
-
-
- search xalphas [ + search ]
-
-
- user xalphas
-
-
- anchor xalphas
-
-
- gtype xalpha
-
-
- xalpha alpha | $ | _ | @ | ! | % | ^ | & | * | ( | ) | . |
- digit
-
-
- xalphas xalpha [ xalphas ]
-
-
- ialpha alpha [ xalphas ]
-
-
- alpha a | b | c | d | e | f | g | h | i | j | k | l | m | n
- | o | p | q | r | s | t | u | v | w | x | y | z | A |
- B | C | D | E | F | G | H | I | J | K | L | M | N | O
- | P | Q | R | S | T | U | V | W | X | Y | Z
-
-
- digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
-
-
- digits digit [ digits ]
-
-
- alphanum alpha | digit
-
-
- alphanums alphanum [ alphanums ]
-
-
- void
-
-
- See also: General description of this syntax, Escaping conventions.
-
-
- _________________________________________________________________
-
-
- Tim BL
-
-
- HTML
-
-
- The WWW system uses marked-up text to represent a hypertext document for
- transmission over the network. The hypertext mark-up language is an SGML
- format. This defines the basic syntax used. The particular language, the set
- of tags and the rules about their use, and their significance are not part of
- the SGML standard. There being no standard on this, we have adopted a set
- which seems sensible. We call them HTML -- hypertext markup language. HTML
- is not an alternative to SGML, it is a particular format within the SGML
- rules (an SGML "DTD"). HTML parsers should ignore tags which they do not
- understand, and ignore attributes which they do not understand of tags which
- they do understand.
-
-
- See also:
-
-
- The tags A list of the tags used in HTML with their
- significance.
-
-
- Example A file containing a variety of tags used for test
- purposes.
-
-
- Default text
-
- Unless otherwise defined by tags, text is transmitted as a stream of lines.
- The division of the stream of characters into lines is arbitrary, and only
- made in order to allow the text to be passed through systems which can only
- handle text with a limited line length. The recommended line length for
- transmission is 80 characters. The division into lines has no significance
- (except in the case of example sections and PLAINTEXT ) apart from
- indicating a word end. Line breaks between tags have no significance.
-
-
- HTML TAGS
-
-
- This is a list of tags used in the HTML language. Each tag starts with a
- tag opener (a less than sign) and ends with a tag closer (a greater than
- sign). Many tags have corresponding closing tags which are identical except
- for a slash after the tag opener. (For example, the TITLE tag).
-
-
- Some tags take parameters, called attributes. The attributes are given
- after the tag, separated by spaces. Certain attributes have an effect simply
- by their presence, others are followed by an equals sign and a value. (See
- the Anchor tag, for example). The names of tags and attributes are not case
- sensitive: they may be in lower, upper, or mixed case with exactly the same
- meaning. (In this document they are generally represented in upper case.)
-
-
- Currently HTML documents are transmitted without the normal SGML framing
- tags, but if these are included parsers will ignore them.
-
-
- Title
-
- The title of a document is given between title tags:
-
-
-
- <TITLE> ... </TITLE>
-
- The text between the opening and the closing tags is a title for the
- hypertext node. There should only be one title in any node. It should
- identify the content of the node in a fairly wide context, and should
- ideally fit on one line.
-
-
- The title is not strictly part of the text of the document, but is an
- attribute of the node. It may not contain anchors, paragraph marks, or
- highlighting. The title may be used to identify the node in a history list,
- to label the window displaying the node, etc. It is not normally displayed
- in the text of a document itself. Contrast titles with headings.
-
-
- Next ID
-
- This tag takes a single attribute which is the number of the next
- document-wide numeric identifier to be allocated (not good SGML). Note that
- when modifying a document, old anchor ids should not be reused, as there
- may be references stored elsewhere which point to them. This is read and
- generated by hypertext editors. Human writers of HTML usually use mnemonic
- alpha identifiers. Browser software may ignore this tag. Example of use:
-
-
-
- <NEXTID 27>
-
- Base Address
-
- Anchors specify addresses of other documents, in a form relative to the
- address of the current document. Normally, the address of a document is
- known to the browser because it was used to access the document. However, if
- a document is mailed, or is somehow visible with more than one address (for
- example, via its filename and also via its library name server catalogue
- number), then the browser needs to know the base address in order to
- correctly deduce external document addresses.
-
-
- The format of this tag is not yet specified.
-
-
- Anchors
-
- The format of an anchor is as follows:
-
-
-
- <A NAME=xxx HREF=XXX> ... </A>
-
- The text between the opening tag and the closing tag is either the start or
- destination (or both) of a link. Attributes of the anchor tag are as
- follows.
-
-
- HREF If the HREF attribute is present, the anchor is
- sensitive text: the start of a link. If the reader
- selects this text, he should be presented with
- another document whose network address is defined by
- the value of the HREF attribute. The format of the
- network address is specified elsewhere. This allows
- for the form HREF=#identifier to refer to another
- anchor in the same document. If the anchor is in
- another document, the attribute is a relative name,
- relative to the document's address (or specified base
- address if any).
-
-
- NAME The attribute NAME allows the anchor to be the
- destination of a link. The value of the parameter is
- that part of a hypertext address which follows the
- hash sign.
-
-
- TYPE An attribute TYPE may give the relationship described
- by the hypertext link. The type is expressed by a
- string for extensibility. Strings for types with
- particular semantics will be registered by the W3
- team. The default relationship if none other is given
- is void.
-
-
- All attributes are optional, although one of NAME and HREF is necessary for
- the anchor to be useful.
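-
- To show how little a parser needs to do with these tags, here is a sketch
- built on the HTMLParser class from the Python standard library (not on any
- W3 code). The class name AnchorCollector and its attributes are
- hypothetical. Tags and attributes it does not know are simply ignored, and
- names are matched without regard to case.
-

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the title, link starts (HREF) and link destinations (NAME)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []                 # values of HREF attributes
        self.names = []                 # values of NAME attributes
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            for key, value in attrs:
                if key == "href":
                    self.links.append(value)
                elif key == "name":
                    self.names.append(value)
        # Any other tag or attribute is ignored.

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
```

-
- After feeding a document to an instance with feed() and calling close(),
- the links list holds the addresses to follow and the names list holds the
- anchor identifiers which other documents may refer to after a hash sign.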
-
-
- IsIndex
-
- This tag informs the reader that the document is an index document. As well
- as reading it, the reader may use a keyword search.
-
-
- Format:
-
-
-
- <ISINDEX>
-
- The node may be queried with a keyword search by suffixing the node address
- with a question mark, followed by a list of keywords separated by plus
- signs. See the network address format.
-
-
- Plaintext
-
- This tag indicates that all following text is to be taken literally, up to
- the end of the file. Plain text is designed to be represented in the same
- way as example XMP text, with fixed-width characters and significant line
- breaks. Format:
-
-
-
- <PLAINTEXT>
-
- This tag allows the rest of a file to be read efficiently without parsing.
- Its presence is an optimisation. There is no closing tag.
-
-
- Example sections
-
- These styles allow text of fixed-width characters to be embedded absolutely
- as is into the document. The format is:
-
-
-
- <LISTING>
-
- ...
-
- </LISTING>
-
- The text between these tags is to be portrayed in a fixed width font, so
- that any formatting done by character spacing on successive lines will be
- maintained. Between the opening and closing tags:
-
-
- The text may contain any ISO Latin printable characters, including the
- tag opener, so long as it does not contain the closing tag in full.
-
-
- Line boundaries are significant, and are to be interpreted as a move to
- the start of a new line.
-
-
- The ASCII Horizontal Tab (HT) character should be interpreted as the
- smallest positive nonzero number of spaces which will leave the
- number of characters so far on the line as a multiple of 8. Its use
- is not recommended however.
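-
- The tab rule above can be sketched as follows. The function name
- expand_tabs is hypothetical; the result agrees with the built-in
- str.expandtabs(8) for lines without other control characters.
-

```python
def expand_tabs(line, width=8):
    """Expand each HT to the next multiple of width columns."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            spaces = width - column % width     # always between 1 and width
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```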
-
-
- The LISTING tag is portrayed so that at least 132 characters will fit on a
- line. The XMP tag is portrayed in a font so that at least 80 characters
- will fit on a line but is otherwise identical to LISTING. The examples of
- markup are here given using the XMP tag.
-
-
- Paragraph
-
- This tag indicates a new paragraph. The exact representation of this
- (indentation, leading, etc) is not defined here, and may be a function of
- other tags, style sheets etc. The format is simply
-
-
-
- <P>
-
- (In SGML terms, paragraph elements are transmitted in minimised form).
-
-
- Headings
-
- Several levels (at least six) of heading are supported. Note that a
- hypertext document tends to need fewer levels of heading than a normal
- document whose only structure is given by the nesting of headings. H1 is the
- highest level of heading, and is recommended for the start of a hypertext
- node. It is suggested that the first heading be one suitable for a reader
- who is already browsing in related information, in contrast to the title tag
- which should identify the node in a wider context.
-
-
-
- <H1>, <H2>, <H3>, <H4>, <H5>, <H6>
-
- These tags are kept as defined in the CERN SGML guide. Their definition is
- completely historical, deriving from the AAP tag set. A difference is that
- HTML documents allow headings to be terminated by closing tags:
-
-
-
- <H2>Second level heading</h2>
-
- Highlighting
-
- The highlighted phrase tags may occur in normal text, and may be nested. For
- each opening tag there must follow a corresponding closing tag. NOT
- CURRENTLY USED.
-
-
-
-
- <HP1>...</HP1> <HP2>... </HP2> etc.
-
- Glossaries
-
-
- A glossary (or definition list) is a list of paragraphs each of which has a
- short title alongside it. Apart from glossaries, this format is useful for
- presenting a set of named elements to the reader. The format is as follows:
-
-
-
-
- <DL>
-
- <DT>Term<DD>definition paragraph
-
- <DT>Term2<DD>Definition of term2
-
- </DL>
-
- Lists
-
-
- A list is a sequence of paragraphs, each of which is preceded by a special
- mark or sequence number. The format is:
-
-
-
-
- <UL>
-
- <LI> list element
-
- <LI> another list element ...
-
- </UL>
-
- The opening list tag (UL for an unordered list, OL for an ordered one) must
- be immediately followed by the first list element. The representation of the
- list is not defined here, but a bulleted list for unordered lists, and a
- sequence of numbered paragraphs for an ordered list would be quite
- appropriate.
-
-
- "OL" IS NOT CURRENTLY USED
-
-
-
-